APPLICATIONS OF DOMAIN DECOMPOSITION AND PARTITION OF 
UNITY METHODS IN PHYSICS AND GEOMETRY 



MICHAEL HOLST 



Abstract. We consider a class of adaptive multilevel domain decomposition-like al- 
gorithms, built from a combination of adaptive multilevel finite element, domain de- 
composition, and partition of unity methods. These algorithms have several interesting 
features such as very low communication requirements, and they inherit a simple and 
elegant approximation theory framework from partition of unity methods. They are also 
very easy to use with highly complex sequential adaptive finite element packages, re- 
quiring little or no modification of the underlying sequential finite element software. 
The parallel algorithm can be implemented as a simple loop which starts off a sequen- 
tial local adaptive solve on a collection of processors simultaneously. We first review 
the Partition of Unity Method (PUM) of Babuska and Melenk, and outline the PUM 
approximation theory framework. We then describe a variant we refer to here as the 
Parallel Partition of Unity Method (PPUM), which is a combination of the Partition of 
Unity Method with the parallel adaptive algorithm of Bank and Hoist. We then derive 
two global error estimates for PPUM, by exploiting the PUM analysis framework it in- 
herits, and by employing some recent local estimates of Xu and Zhou. We then discuss 
a duality-based variant of PPUM which is more appropriate for certain applications, and 
we derive a suitable variant of the PPUM approximation theory framework. Our im- 
plementation of PPUM-type algorithms using the FETK and MC software packages is 
described. We then present a short numerical example involving the Einstein constraints 
arising in gravitational wave models. 



In this article we consider a class of adaptive multilevel domain decomposition-like 
algorithms, built from a combination of adaptive multilevel finite element, domain de- 
composition, and partition of unity methods. These algorithms have several interesting 
features such as very low communication requirements, and they inherit a simple and 
elegant approximation theory framework from partition of unity methods. They are also 
very easy to use with highly complex sequential adaptive finite element packages, re- 
quiring little or no modification of the underlying sequential finite element software. 
The parallel algorithm can be implemented as a simple loop which starts off a sequential 
local adaptive solve on a collection of processors simultaneously. 
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We first review the Partition of Unity Method (PUM) of Babuska and Melenk in Sec- 
tion 2, and outline the PUM approximation theory framework. In Section 3, we describe 
a variant we refer to here as the Parallel Partition of Unity Method (PPUM), which is 
a combination of the Partition of Unity Method with the parallel adaptive algorithm 
from [4]. We then derive two global error estimates for PPUM, by exploiting the PUM 
analysis framework it inherits, and by employing some recent local estimates of Xu and 
Zhou [22]. We then discuss a duality-based variant of PPUM in Section 4 which is more 
appropriate for certain applications, and we derive a suitable variant of the PPUM ap- 
proximation theory framework. Our implementation of PPUM-type algorithms using the 
FEtk and MC software packages is described in Section 5. We then present a short 
numerical example in Section 6 involving the Einstein constraints arising in gravitational 
wave models. 



2. The Partition of Unity Method (PUM) of Babuska and Melenk 

We first briefly review the partition of unity method (PUM) of Babuska and Me- 
lenk [1]. Let Q C R d be an open set and let {f^} be an open cover of Q with a bounded 
local overlap property: For all x E Q 9 there exists a constant M such that 

sup{ i | x e ili } < M. (2.1) 

i 

A Lipschitz partition of unity {&} subordinate to the cover {f^} satisfies the following 
five conditions: 

X>(*) = 1, VxGO, (2.2) 

(fc>0), (2.3) 

(2.4) 
(2.5) 

Vi. (2.6) 

Several explicit constructions of partitions of unity satisfying (2.2)-(2.6) exist. The sim- 
plest construction in the case of a polygon Q C R d employs global C° piecewise linear 
finite element basis functions defined on a simplex mesh subdivision S of Q. The {Qi} 
are first built by first constructing a disjoint partitioning {^°} of S using e.g. spectral or 
inertial bisection [4]. Each of the disjoint Q° are extended to define Qi by considering 
all boundary vertices of Q°; all simplices of neighboring j ^ i which are contained 
in the boundary vertex 1 -rings of Q° are added to Q° to form f^. This procedure pro- 
duces the smallest overlap for the {Qi}, such that the properties (2.2)-(2.5) are satisfied 
by the resulting {&} built from the nodal C° piecewise linear finite element basis func- 
tions. Property (2.6) is also satisfied, but Cq will depend on the diameter of the overlap 
simplices. More sophisticated constructions with superior properties are possible; see 
e.g. [8, 19]. 

The partition of unity method (PUM) builds an approximation u ap = J2i ^i v i where 
the Vi are taken from the local approximation spaces: 

v t c c k (n n Qi) c h\q n n t ), vi, (k > o). (2.7) 

The following simple lemma makes possible several useful results. 
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Lemma 2.1. Let w. wi G H 1 ^) with supp Wi C Q n f^. TTien 

X^ii^ii^(^) - m ii^ii^(^)' /c = ' 1 

II y^^llfffefni < ll^ll/ffc(flng,)? ^ = 0,1 

i i 

Proof. The proof follows from (2.1) and (2.2)-(2.6); see [1]. □ 

The basic approximation properties of PUM following from 2.1 are as follows. 

Theorem 2.2 (Babuska and Melenk [1]). If the local spaces V* have the following ap- 
proximation properties: 

\\u - Vi\\ L 2(n nQi) < e (i), Vi, 
||V(u - Vi)|| L 2 (nnn .) < ei(i), Vi, 

f/jOT the following a priori global error estimates hold: 

1/2 

/TT^, / X ^ 9 / .\ I 



1/2 



||v(„-^)||* (n) < ^(E(^y) 2 ^« + ^)) 

Proof This follows from Lemma 2.1 by taking u — u ap = <^(m — ^) and then by 
using = (pi(u — in Lemma 2.1. □ 

Consider now the following linear elliptic problem: 

-V • (dVu) = / in ft, (2 g) 
w = on <9ft, 

where a y G W 1 ' 00 ^), / G L 2 (ft), a^-^Cj > a > 0, V& ^ 0, where ft C is a convex 
polyhedral domain. A weak formulation is: 

Find u G ^o(^) such that <^H 3 v) = 0, V'u G ^(fi), (2.9) 

where 

(F(u),v) = / aVw • Vu dx — fv dx. 

A general Galerkin approximation is the solution to the subspace problem: 

Find u ap G V C ^(ft) s.t. (F(u ap ),v) = 0, Vi; G V C i7<}(ft). (2.10) 

With PUM, the subspace 1/ for the Galerkin approximation is taken to be the globally 
coupled PUM space (cf. [8]): 

V = | v \v = ^ e ^ |> C tf^ft), 



If error estimates are available for the quality of the local solutions produced in the local 
spaces, then the PUM approximation theory framework given in Theorem 2.2 guarantees 
a global solution quality. 
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3. A Parallel Partition of Unity Method (PPUM) 

A new approach to the use of parallel computers with adaptive finite element methods 
was presented recently in [4]. The following variant of the algorithm in [4] is described 
in [9], which we refer to as the Parallel Partition of Unity Method (or PPUM). This 
variant replaces the final global smoothing iteration in [4] with a reconstruction based 
on Babuska and Melenk's original Partition of Unity Method [1], which provides some 
additional approximation theory structure. 
Algorithm (PPUM - Parallel Partition of Unity Method [4, 9]) 

( 1) Discretize and solve the problem using a global coarse mesh. 

(2) Compute a posteriori error estimates using the coarse solution, and 
decompose the mesh to achieve equal error using weighted spectral 
or inertia! bisection. 

(3) Give the entire mesh to a collection of processors, where each pro- 
cessor will perform a completely independent multilevel adaptive 
solve, restricting local refinement to only an assigned portion of the 
domain. The portion of the domain assigned to each processor co- 
incides with one of the domains produced by spectral bisection with 
some overlap (produced by conformity algorithms, or by explicitly 
enforcing substantial overlap). When a processor has reached an 
error tolerance locally, computation stops on that processor. 

(4) Combine the independently produced solutions using a partition of 
unity subordinate to the overlapping subdomains. 

While the PPUM algorithm seems to ignore the global coupling of the elliptic problem, 
recent results on local error estimation [22], as well as some not-so-recent results on 
interior estimates [17], support this as provably good in some sense. The principle idea 
underlying the results in [17, 22] is that while elliptic problems are globally coupled, 
this global coupling is essentially a "low-frequency" coupling, and can be handled on 
the initial mesh which is much coarser than that required for approximation accuracy 
considerations. This idea has been exploited, for example, in [21, 22], and is why the 
construction of a coarse problem in overlapping domain decomposition methods is the 
key to obtaining convergence rates which are independent of the number of subdomains 
(c.f. [20]). An example showing the types of local refinements that occur within each 
subdomain is depicted in Figure 1 . 




Figure 1 . Example showing the types of local refinements created by PPUM. 

To illustrate how PPUM can produce a quality global solution, we will give a global 
error estimate for PPUM solutions. This analysis can also be found in [9]. We can view 
PPUM as building a PUM approximation u pp = ^ ■ fcvi where the V{ are taken from the 
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local spaces: 

V = XiVf c c k (n n a) c H\n n a), Vi, (fc > o), (3.1) 

where is the characteristic function for Q i9 and where 

Vf c C k {Q) C F 1 ^), Vi, (fc > 0). (3.2) 

In PPUM, the global spaces Vf in (3.1)-(3.2) are built from locally enriching an initial 
coarse global space V by locally adapting the finite element mesh on which V is built. 
(This is in contrast to classical overlapping Schwarz methods where local spaces are 
often built through enrichment of Vb by locally adapting the mesh on which Vq is built, 
and then removing the portions of the mesh exterior to the adapted region.) The PUM 
space V is then 



V < v I v 



= < V \ V 



In contrast to the approach in PUM where one seeks a global Galerkin solution in the 
PUM space as in (2.10), the PPUM algorithm described here and in [9] builds a global 
approximation u pp to the solution to (2.9) from decoupled local Galerkin solutions: 



u rm = > (j)jUi = y^]<j>iV%, (3.3) 



where each vf satisfies: 

Find u\ e such that (F(ixf ), ) = 0, Vvf e V?. (3.4) 

We have the following global error estimate for the approximation in (3.3) built 
from (3.4) using the local PPUM parallel algorithm. 

Theorem 3.1. Assume the solution to (2.8) satisfies u e H 1+a (Q), a > 0, that quasi- 
uniform meshes of sizes h and H > h are used for and Q\Q® respectively, and that 
diam(Qi) > 1/Q > Vi. If the local solutions are built from C° piecewise linear finite 
elements, then the global solution u pp in (3.3) produced by Algorithm PPUM satisfies the 
following global error bounds: 



\u — u 



■ppWvM < VPMCov + C 2 H 1+a ) , 



\\V(u-u pp )\\ L 2 {n) < yj2PM(QiC 2 G + CI) (Ci/i a + C 2 H 1+a ) , 
where P number of local spaces Vi. Further, if H < then: 

\\u - u pp \\ L 2 {n) < V PMCco max{Ci, C 2 }h a , 



||V^-^ p )|| L2( n) < ^2PM(Q*C 2 G + C2 ) )max{C 1 ,C 2 }^, 

so that the solution produced by Algorithm PPUM is of optimal order in the ¥P-norm. 

Proof. Viewing PPUM as a PUM gives access to the a priori estimates in Theorem 2.2; 
these require local estimates of the form: 

||^ - Ui\\ L 2(n nQi) = \\u - v%\\ L 2(n nQi ) < e (i), 
||V(w - Ui)\\ L 2( QnQi) = \\V(u - ul)\\ L 2 {QnQi) < ei(i). 
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Such local a priori estimates are available for problems of the form (2.8) [17, 22]. They 
can be shown to take the following form: 

\\u - <||ifi(n,nn) < C f inf \\u - ^H^o^ + \\u - ul\\ L 2 {n) J 

\ V i J 

where 

v° c c k (n° z nn)c H\n t n Q), 

and where 

Since we assume u e iif a > 0, and since quasi-uniform meshes of sizes h and 

H > h are used for and respectively, we have: 

\\u - u 9 i \\ HH n inn) = l\\u - <|||2 (n . nn) + ||V(u - u?)||i2 (n . nn) J 



l+a 



< C 1 h a + C 2 H 

I.e., in this setting we can use e (i) = ei(i) = Ci/i a + C 2 H 1+a . The a priori PUM 
estimates in Theorem 2.2 then become: 



\mn) < ^MC 0O ^(C 1 ^ + C 2 i/ 1+Q ) 2 j , 



\V{u-u pp )\\ L 2 {n) < V2M 

1/2 



4^ v diam (^) / 



2 



l+a\2 



If P = number of local spaces V*, and if diam(fij) > 1/Q > Vi, this is simply: 



\u-u pp \\ L 2 {cl) < VPMC^ (dh a + C 2 H l+a ) , 



\\V(u-u pp )\\ L 2 m < ^PM^Cl + CDidh* + C 2 H 1+a ) . 

If H < then u pp from PPUM is asymptotically as good as a global Galerkin 

solution when the error is measured in the i7 1 -norm. □ 

Local versions of Theorem 3.1 appear in [22] for a variety of related parallel algo- 
rithms. Note that the local estimates in [22] hold more generally for nonlinear versions 
of (2.8), so that Theorem 3.1 can be shown to hold in a more general setting. Finally, it 
should be noted that improving the estimates in the L 2 -norm is not generally possible; 
the required local estimates simply do not hold. Improving the solution quality in the 
L 2 -norm generally requires more global information. However, for some applications 
one is more interested in a quality approximation of the gradient or the energy of the 
solution rather than to the solution itself. 



4. Duality-based PPUM 

We first briefly review a standard approach to the use of duality methods in error 
estimation, (cf. [6, 7] for a more complete discussion). Consider the weak formula- 
tion (2.9) involving a possibly nonlinear differential operator F : Hq(Q) h> 
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and a Galerkin approximation u ap satisfying (2.10). If F £ C 1 , the generalized Taylor 
expansion exists: 

F{u + h) = F(u) + j jf £>F(u + J ft. 

With e = u — u ap , and with F(u) = 0, leads to the linearized error equation: 

F{u ap ) = F(u - e) = F(u) + ^4(^ ap - = -*4e, 
where the linearization operator .4 is defined as: 

A= f DF(u + £h)d£. 
Jo 

Assume now we are interested in a linear functional of the error /(e) = (e, ip), where i(j 
is the (assumed accessible) Riesz-representer of /(•). If (j) G ifo(fi) * s the solution to the 
linearized dual problem: 

A T cj) = ^ 

then we can exploit the linearization operator .4 and its adjoint A T to give the following 
identity: 

(e^) = (e,A T 0) = (Ae,</>) = -(F(u ap ),</>). (4.1) 

If we can compute an approximation c/) ap E V C Hq(Q) to the linearized dual problem 
then we can estimate the error by combining this with the (computable) residual F(u ap )\ 

\(e^)\ = \(F(u ap )^)\ = \(F(u ap ),<l>-<l> ap )l 

where the last term is a result of (2.10). The term on the right is then estimated locally 
using assumptions on the quality of the approximation cj) ap and by various numerical 
techniques; cf. [6]. The local estimates are then used to drive adaptive mesh refinement. 
This type of duality-based error estimation has been shown to be useful for certain ap- 
plications in engineering and other areas where accuracy in a linear functional of the 
solution is important, but accuracy in the solution itself is not (cf. [7]). 

Consider now this type of error estimation in the context of domain decomposition and 
PPUM. Given a linear or nonlinear weak formulation as in (2.9), we are interested in the 
solution u as well as in the error in PPUM approximations u pp as defined in (3.3)-(3.4). If 
a global linear functional l(u — u pp ) of the error u — u pp is of interest rather than the error 
itself, then we can formulate a variant of the PPUM parallel algorithm which has in some 
sense a more general approximation theory framework than that of the previous section. 
There are no assumptions beyond solvability of the local problems and of the global dual 
problems with localized data, and perhaps some minimal smoothness assumptions on 
the dual solution. In particular, the theory does not require local a priori error estimates; 
the local a priori estimates are replaced by solving global dual problem problems with 
localized data, and then incorporating the dual solutions explictly into the a posteriori 
error estimate. As a result, the large overlap assumption needed for the local estimates in 
the proof of Theorem 3.1 is unnecessary. Similarly, the large overlap assumption needed 
to achieve the bounded gradient property (2.6) is no longer needed. 

The following result gives a global bound on a linear functional of the error based on 
satisfying local computable a posteriori bounds involving localized dual problems. 

Theorem 4.1. Let {^} be a partition of unity subordinate to a cover {Qi}. If is 
the Riesz-representer for a linear functional l(u), then the functional of the error in the 
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PPUM approximation u pp from (3.3) satisfies 

v 

l(u-Upp) = -^2(F(u 9 i ),uj i ) } 
k=i 

where u\ are the solutions to the subspace problems in (3.4), and where the u)i are the 
solutions to the following global dual problems with localized data: 

FinduJi G Hq(Q) such that (A T (jj il v) L 2^ ((f>i^^v) L 2^ n ^ \/v G Hq(Q). (4.2) 

Moreover, if the local residual F(uf), weighted by the localized dual solution uj iy satisfies 
the following error tolerance in each subspace: 

\(F(uf),u Ji )\<^t = l,...,p (4.3) 

then the linear functional of the global error u — u pp satisfies 

\l(u-Upp)\ < e. (4.4) 

Proof. With l(u — u pp ) — (u — u PP) ip) the localized representation comes from: 

p p p 

(u - u pp , ^) L2(n) = (^2 & u - = ^2M u - u D^)li{^ % )- 

k=l 2=1 k=l 

From (4.1) and (4.2), each term in the sum can be written in terms of the local residual 
F(v%) as follows: 

{(j>i{u - u\), ^)L2(nnno = ( u ~ u h <t>itl>)i?{sirfoi) 

= (u - v%,A T Ui) L 2(n) 
= (A(u - u?),(Ji) L 2( Q) 

This gives then 

p p 
\(u-u pp ^) L 2 (n) \ <^2\(F(ul)^)\ <J2^ = e. 

k=i k=i p 

□ 

We will make a few additional remarks about the parallel adaptive algorithm which 
arises naturally from Theorem 4.1. Unlike the case in Theorem 3.1, the constants 
and Cq in (2.5) and (2.6) do not impact the error estimate in Theorem 4.1, removing the 
need for the a priori large overlap assumptions. Moreover, local a priori estimates are 
not required either, removing a second separate large overlap assumption that must be 
made to prove results such as Theorem 3.1. Using large overlap of a priori unknown 
size to satisfy the requirements for Theorem 3.1 seems unrealistic for implementations. 
On the other hand, no such a priori assumptions are required to use the result in Theo- 
rem 4.1 as the basis for a parallel adaptive algorithm. One simply solves the local dual 
problems (4.2) on each processor independently, adapts the mesh on each processor in- 
dependently until the computable local error estimate satisfies the tolerance (4.3), which 
then guarantees that the functional of the global error meets the target in (4.4). 

Whether such a duality-based approach will produce an efficient parallel algorithm 
is not at all clear; however, it is at least a mechanism for decomposing the solution to 
an elliptic problem over a number of subdomains. Note that ellipticity is not used in 
Theorem 4.1, so that the approach is also likely reasonable for other classes of PDE. 



APPLICATIONS OF DD AND PUM IN PHYSICS AND GEOMETRY 



9 



These questions, together with a number of related duality-based decomposition algo- 
rithms are examined in more detail in [5]. The analysis in [5] is based on a different 
approach involving estimates of Green function decay rather than through partition of 
unity methods. 

5. Implementation in FEtk and MC 

Our implementations are performed using FEtk and MC (see [9] for a more complete 
discussion of MC and FEtk). MC is the adaptive multilevel finite element software ker- 
nel within FEtk, a large collection of collaboratively developed finite element software 
tools based at UC San Diego (see www . f etk . org). MC is written in ANSI C (as is 
most of FEtk), and is designed to produce highly accurate numerical solutions to non- 
linear covariant elliptic systems of tensor equations on 2- and 3 -manifolds in an optimal 
or nearly-optimal way. MC employs a posteriori error estimation, adaptive simplex sub- 
division, unstructured algebraic multilevel methods, global inexact Newton methods, and 
numerical continuation methods. Several of the features of MC are somewhat unusual, 
allowing for the treatment of very general nonlinear elliptic systems of tensor equations 
on domains with the structure of (Riemannian) 2- and 3-manifolds. Some of these fea- 
tures are: 

• Abstraction of the elliptic system: The elliptic system is defined only through 
a nonlinear weak form over the domain manifold, along with an associated lin- 
earization form, also defined everywhere on the domain manifold (precisely the 
forms (F(u),v) and (DF(u)w, v) in the discussions above). 

• Abstraction of the domain manifold: The domain manifold is specified by giving 
a polyhedral representation of the topology, along with an abstract set of coor- 
dinate labels of the user's interpretation, possibly consisting of multiple charts. 
MC works only with the topology of the domain, the connectivity of the poly- 
hedral representation. The geometry of the domain manifold is provided only 
through the form definitions, which contain the manifold metric information. 

• Dimension independence: Exactly the same code paths in MC are taken for both 
two- and three-dimensional problems (as well as for higher-dimensional prob- 
lems). To achieve this dimension independence, MC employs the simplex as its 
fundamental geometrical object for defining finite element bases. 

As a consequence of the abstract weak form approach to defining the problem, the com- 
plete definition of a complex nonlinear tensor system such as large deformation nonlinear 
elasticity requires writing only a few hundred lines of C to define the two weak forms. 
Changing to a different tensor system (e.g. the example later in the paper involving the 
constraints in the Einstein equations) involves providing only a different definition of the 
forms and a different domain description. 

A datastructure referred to as the ringed-vertex (cf. [9]) is used to represent meshes 
of d-simplices of arbitrary topology. This datastructure is illustrated in Figure 2. The 
ringed-vertex datastructure is similar to the winged-edge, quad-edge, and edge-facet 
datastructures commonly used in the computational geometry community for represent- 
ing 2-manifolds [15], but it can be used more generally to represent arbitrary rf-manifolds, 
d > 2. It maintains a mesh of d-simplices with near minimal storage, yet for shape- 
regular (non-degenerate) meshes, it provides 0(l)-time access to all information neces- 
sary for refinement, un-refinement, and Petrov-Galerkin discretization of a differential 
operator. The ringed- vertex datastructure also allows for dimension independent imple- 
mentations of mesh refinement and mesh manipulation, with one implementation (the 
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Figure 2. Polyhedral manifold representation. The figure on the left 
shows two overlapping polyhedral (vertex) charts consisting of the two 
rings of simplices around two vertices sharing an edge. The region con- 
sisting of the two darkened triangles around the face / is denoted cjf, 
and represents the overlap of the two vertex charts. Polyhedral manifold 
topology is represented by MC using the ringed-vertex (or RIVER) datas- 
tructure. The datastructure is illustrated for a given simplex s in the figure 
on the right; the topology primitives are vertices and d-simplices. The 
collection of the simplices which meet the simplex s at its vertices (which 
then includes those simplices that share faces as well) is denoted as uj s . 



same code path) covering arbitrary dimension d. An interesting feature of this datastruc- 
ture is that the C structures used for vertices, simplices, and edges are all of fixed size, so 
that a fast array-based implementation is possible, as opposed to a less-efficient list-based 
approach commonly taken for finite element implementations on unstructured meshes. A 
detailed description of the ringed- vertex datastructure, along with a complexity analysis 
of various traversal algorithms, can be found in [9]. 

Our modifications to MC to implement PPUM are minimal, and are described in detail 
in [4]. These modifications involve primarily forcing the error indicator to ignore regions 
outside the subdomain assigned to the particular processor. The implementation does not 
form an explicit partition of unity or a final global solution; the solution must be evaluated 
locally by locating the disjoint subdomain containing the physical region of interest, 
and then by using the solution produced by the processor assigned to that particular 
subdomain. Note that forming a global conforming mesh as needed to build a global 
partition of unity is possible even in a very loosely coupled parallel environment, due to 
the deterministic nature of the bisection-based algorithms we use for simplex subdivision 
(see [9]). For example, if bisection by longest edge (supplemented with tie-breaking) is 
used to subdivide any simplex that is refined on any processor, then the progeny types, 
shapes, and configurations can be predicted in a completely determinstic way. If two 
simplices share faces across a subdomain boundary, then they are either compatible (their 
triangular faces exactly match), or one of the simplices has been bisected more times than 
its neighbor. By exchanging only the generation numbers between subdomains, a global 
conforming mesh can be reached using only additional bisection. 
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6. Example 1: The Einstein Constraints in Gravitation 

The evolution of the gravitational field was conjectured by Einstein to be governed by 
twelve coupled first-order hyperbolic equations for the metric of space-time and its time 
derivative, where the evolution is constrained for all time by a coupled four-component 
elliptic system. The theory basically gives what is viewed as the correct interpretation 
of the graviational field as a bending of space and time around matter and energy, as 
opposed to the classical Newtonian view of the gravitational field as analogous to the 
electrostatic field; cf. Figure 3. The four-component elliptic constraint system consists 




Figure 3. Newtonian versus general relativistic explanations of gravi- 
tation: the small mass simply follows a geodesic on the curved surface 
created by the large mass. 



of a nonlinear scalar Hamiltonian constraint, and a linear 3 -vector momentum constraint. 
The evolution and constraint equations, similar in some respects to Maxwell's equations, 
are collectively referred to as the Einstein equations. Solving the constraint equations 
numerically, separately or together with the evolution equations, is currently of great in- 
terest to the physics community due to the recent construction of a new generation of 
gravitational wave detectors (cf. [12, 11] for more detailed discussions of this applica- 
tion). 

Allowing for both Dirichlet and Robin boundary conditions on a 3-manifold M with 
boundary dM = <9 .MU<9i M, as typically the case in black hole and neutron star models 
(cf. [12, 11]), the strong form of the constraints can be written as: 



A0 = - J R0+_(tr^) 2 5 (6.1) 
-l(*A ab + (LW) ab ) 2 r 7 ~ 27rp0- 3 in M, 



h a D a <j) + c0 


= zondiM., 


(6.2) 


4> 


= fondoM, 


(6.3) 


D b {LW) ab 


= -4> 6 D a trK + 8irf in M, 
3 


(6.4) 


{LW) ab n b + C\W b 


= Z a ond 1 M, 


(6.5) 


w a 


= F a ond M, 


(6.6) 



where the following standard notation has been employed: 

A<p = D a D a <p, 

(Lw) ab = b a w h + b b w a - -^ ab b c w c , 

3 

tvK = r b K ab , 
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In the tensor expressions above, there is an implicit sum on all repeated indices in prod- 
ucts, and the covariant derivative with respect to the fixed background metric % b is de- 
noted as D a The remaining symbols in the equations (R, K, *A ab , p, j a , z, Z a , f, F a , 
c, and C%) represent various physical parameters, and are described in detail in [12, 11] 
and the referenences therein. Stating the system as set of tensor equations comes from 
the need to work with domains which generally have the structure of 3 -manifolds rather 
than single open sets in M 3 (cf. [9]). 

Equations (6.1)-(6.6) are known to be well-posed only for certain problem data and 
manifold topologies [16, 13]. Note that if multiple solutions in the form of folds or bi- 
furcations are present in solutions of (6.1)-(6.6) then path-following numerical methods 
will be required for numerical solution [14]. For our purposes here, we select the problem 
data and manifold topology such that the assumptions for the two general well-posedness 
results in [12] hold for (6.1)-(6.6). The assumptions required for the two results in [12] 
are quite weak, and are, for the most part, minimal assumptions beyond those required 
to give a well-defined weak formulation in ZAbased Sobolev spaces. 

In [9], two quasi-optimal a priori error estimates are established for Galerkin approx- 
imations to the solutions to (6.1)-(6.6). These take the form (see Theorems 4.3 and 4.4 



where Vh C H l (M) is e.g. a finite element space. In the case of the momentum con- 
straint, there is a restriction on the size of the elements in the underlying finite element 
mesh for the above results to hold, characterized above by the parameter a h . This restric- 
tion is due to the fact that the result is established through of the Garding inequality result 
due to Schatz [18]. In the case of the Hamiltonian constraint, there are no restrictions on 
the approximation spaces. 

To use MC to calculate the initial bending of space and time around a single massive 
black hole by solving the above constraint equations, we place a spherical object of unit 
radius in space, and infinite space is truncated with an enclosing sphere of radius 100. 
(This outer boundary may be moved further from the object to improve the accuracy 
of boundary condition approximations.) Reasonable choices for the remaining func- 
tions and parameters appearing in the equations are used below to completely specify the 
problem for use as an illustrative numerical example. (More careful examination of the 
various functions and parameters appear in [12], and a number of detailed experiments 
with more physically meaningful data appear in [1 1].) 

We then generate an initial (coarse) mesh of tetrahedra inside the enclosing sphere, 
exterior to the spherical object within the enclosing sphere. The mesh is generated by 
adaptively bisecting an initial mesh consisting of an icosahedron volume filled with tetra- 
hedra. The bisection procedure simply bisects any tetrahedron which touches the surface 
of the small spherical object. When a reasonable approximation to the surface of the 
sphere is obtained, the tetrahedra completely inside the small spherical object are re- 
moved, and the points forming the surface of the small spherical object are projected to 
the spherical surface exactly. This projection involves solving a linear elasticity prob- 
lem, together with the use of a shape-optimization-based smoothing procedure. The 
smoothing procedure locally optimizes the shape measure function described in [9] in an 



in [9]): 




(6.7) 



(6.8) 
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iterative fashion. A much improved binary black hole mesh generator has been devel- 
oped by D. Bernstein; the new mesh generator is described in [11] along with a number 
of more detailed examples using MC. 

The initial coarse mesh is shown in Figure 4, generated using the procedure described 
above, has approximately 30,000 tetrahedral elements and 5,000 vertices. To solve the 
problem on a 4-processor computing cluster using a PPUM-like algorithm, we begin by 
partitioning the domain into four subdomains (shown in Figure 5) with approximately 
equal error using the recursive spectral bisection algorithm described in [4]. The four 
subdomain problems are then solved independently by MC, starting from the complete 
coarse mesh and coarse mesh solution. The mesh is adaptively refined in each subdomain 
until a mesh with roughly 50000 vertices is obtained (yielding subdomains with about 
250000 simplices each). 

The refinement performed by MC is confined primarily to the given region as driven by 
the weighted residual error indicator described in [9], with some refinement into adjacent 
regions due to the closure algorithm which maintains conformity and shape regularity. 
The four problems are solved completely independently by the sequential adaptive soft- 
ware package MC. One component of the solution (the conformal factor 0) of the elliptic 
system is depicted in Figures 6 (the subdomain and subdomain 1 solutions). 

A number of more detailed examples involving the contraints, using more physically 
meaningful data, appear in [1 1]. Application of PPUM to massively parallel simulations 
of microtubules and other extremely large and complex biological structures can be found 
in [3, 2]. The results in [3, 2] demonstrate both good parallel scaling of PPUM as well as 
quality approximation of the gradient of electrostatic potentials (solutions to the Poisson- 
Boltzmann equation; cf. [10]). 




FIGURE 4. Recursize spectral bisection of the single hole domain into 
four subdomains (boundary surfaces of three of the four subdomains are 
shown). 
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